Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
translated by 谷歌翻译
丹尼德缩放结束和摩尔法的放缓使能量使用数据中心在不可持续的道路上。数据中心已经是全球电力使用的大部分,应用需求以快速缩放。我们认为,数据中心计算的碳强度的大幅减少可以通过以软件为中心的方法来实现:通过修改系统API,通过修改系统API来使应用程序开发人员可见的能量和碳,使其成为可能进行知情的贸易性能和碳排放之间,并通过提高应用程序编程水平,以便灵活地使用更节能的计算和存储方法。我们还为系统软件奠定了一个研究议程,以减少数据中心计算的碳足迹。
translated by 谷歌翻译
We present Spider, a large-scale, complex and cross-domain semantic parsing and textto-SQL dataset annotated by 11 college students. It consists of 10,181 questions and 5,693 unique complex SQL queries on 200 databases with multiple tables, covering 138 different domains. We define a new complex and cross-domain semantic parsing and textto-SQL task where different complex SQL queries and databases appear in train and test sets. In this way, the task requires the model to generalize well to both new SQL queries and new database schemas. Spider is distinct from most of the previous semantic parsing tasks because they all use a single database and the exact same programs in the train set and the test set. We experiment with various state-of-the-art models and the best model achieves only 12.4% exact matching accuracy on a database split setting. This shows that Spider presents a strong challenge for future research. Our dataset and task are publicly available at https://yale-lily. github.io/spider.
translated by 谷歌翻译
Visual object tracking under challenging conditions of motion and light can be hindered by the capabilities of conventional cameras, prone to producing images with motion blur. Event cameras are novel sensors suited to robustly perform vision tasks under these conditions. However, due to the nature of their output, applying them to object detection and tracking is non-trivial. In this work, we propose a framework to take advantage of both event cameras and off-the-shelf deep learning for object tracking. We show that reconstructing event data into intensity frames improves the tracking performance in conditions under which conventional cameras fail to provide acceptable results.
translated by 谷歌翻译
Both clustering and outlier detection play an important role for meteorological measurements. We present the AWT algorithm, a clustering algorithm for time series data that also performs implicit outlier detection during the clustering. AWT integrates ideas of several well-known K-Means clustering algorithms. It chooses the number of clusters automatically based on a user-defined threshold parameter, and it can be used for heterogeneous meteorological input data as well as for data sets that exceed the available memory size. We apply AWT to crowd sourced 2-m temperature data with an hourly resolution from the city of Vienna to detect outliers and to investigate if the final clusters show general similarities and similarities with urban land-use characteristics. It is shown that both the outlier detection and the implicit mapping to land-use characteristic is possible with AWT which opens new possible fields of application, specifically in the rapidly evolving field of urban climate and urban weather.
translated by 谷歌翻译
Social media and messaging apps have become major communication platforms. Multimedia contents promote improved user engagement and have thus become a very important communication tool. However, fake news and manipulated content can easily go viral, so, being able to verify the source of videos and images as well as to distinguish between native and downloaded content becomes essential. Most of the work performed so far on social media provenance has concentrated on images; in this paper, we propose a CNN architecture that analyzes video content to trace videos back to their social network of origin. The experiments demonstrate that stating platform provenance is possible for videos as well as images with very good accuracy.
translated by 谷歌翻译
Machine Learning models capable of handling the large datasets collected in the financial world can often become black boxes expensive to run. The quantum computing paradigm suggests new optimization techniques, that combined with classical algorithms, may deliver competitive, faster and more interpretable models. In this work we propose a quantum-enhanced machine learning solution for the prediction of credit rating downgrades, also known as fallen-angels forecasting in the financial risk management field. We implement this solution on a neutral atom Quantum Processing Unit with up to 60 qubits on a real-life dataset. We report competitive performances against the state-of-the-art Random Forest benchmark whilst our model achieves better interpretability and comparable training times. We examine how to improve performance in the near-term validating our ideas with Tensor Networks-based numerical simulations.
translated by 谷歌翻译
Soft robots are interesting examples of hyper-redundancy in robotics, however, the nonlinear continuous dynamics of these robots and the use of hyper-elastic and visco-elastic materials makes modeling of these robots more complicated. This study presents a geometric Inverse Kinematic (IK) model for trajectory tracking of multi-segment extensible soft robots, where, each segment of the soft actuator is geometrically approximated with multiple rigid links connected with rotary and prismatic joints. Using optimization methods, the desired configuration variables of the soft actuator for the desired end-effector positions are obtained. Also, the redundancy of the robot is applied for second task applications, such as tip angle control. The model's performance is investigated through simulations, numerical benchmarks, and experimental validations and results show lower computational costs and higher accuracy compared to most existing methods. The method is easy to apply to multi segment soft robots, both in 2D and 3D. As a case study, a fully 3D-printed soft robot manipulator is tested using a control unit and the model predictions show good agreement with the experimental results.
translated by 谷歌翻译
Federated Learning (FL) enables the training of Deep Learning models without centrally collecting possibly sensitive raw data. This paves the way for stronger privacy guarantees when building predictive models. The most used algorithms for FL are parameter-averaging based schemes (e.g., Federated Averaging) that, however, have well known limits: (i) Clients must implement the same model architecture; (ii) Transmitting model weights and model updates implies high communication cost, which scales up with the number of model parameters; (iii) In presence of non-IID data distributions, parameter-averaging aggregation schemes perform poorly due to client model drifts. Federated adaptations of regular Knowledge Distillation (KD) can solve and/or mitigate the weaknesses of parameter-averaging FL algorithms while possibly introducing other trade-offs. In this article, we provide a review of KD-based algorithms tailored for specific FL issues.
translated by 谷歌翻译
We present SLATE, a sequence labeling approach for extracting tasks from free-form content such as digitally handwritten (or "inked") notes on a virtual whiteboard. Our approach allows us to create a single, low-latency model to simultaneously perform sentence segmentation and classification of these sentences into task/non-task sentences. SLATE greatly outperforms a baseline two-model (sentence segmentation followed by classification model) approach, achieving a task F1 score of 84.4\%, a sentence segmentation (boundary similarity) score of 88.4% and three times lower latency compared to the baseline. Furthermore, we provide insights into tackling challenges of performing NLP on the inking domain. We release both our code and dataset for this novel task.
translated by 谷歌翻译